import matplotlib.pyplot as plt
import plotly.express as px
import pandas as pd
Research Introduction: AI vs. Non-AI Careers
Introduction
Artificial Intelligence (AI) is rapidly transforming industries, reshaping job roles, and redefining employment patterns. While AI-driven automation raises concerns about job displacement, it also fosters job creation in emerging fields. As AI continues to integrate into various sectors in 2024 and beyond, understanding its impact on job security is essential. This study examines whether AI is primarily displacing jobs or generating new opportunities, highlighting industries experiencing AI-driven growth versus those facing employment risks.
Research Rationale
Recent studies indicate that AI is transforming the job market by automating roles in manufacturing, customer service, and administrative work, leading to job displacement. At the same time, it is driving demand for skilled professionals in AI development, cybersecurity, and data analysis, creating new career opportunities (Perrault and Clark (2024)). As AI continues to evolve, workers will need to adapt by developing new skills, making AI-focused education and training increasingly essential. Although AI will displace certain tasks in manufacturing and administrative work, it will also lead to the evolution of existing jobs. This has happened before with earlier technological change: although it led to job displacement, it also led to revolutionary strides in the efficiency of work. That efficiency in turn leads to economic and job growth, which the public should welcome rather than turn away from (George (2024)). Further, not all jobs are replaceable, but many would benefit greatly from incorporating some level of AI, including jobs in supply chain, medicine, and manufacturing. The right training will be critical to making that transition. This research aims to explore how AI is reshaping employment, balancing job losses with new job creation. The findings will provide valuable insights for businesses, policymakers, and workers, helping them navigate AI-driven changes and prepare for the future workforce.
Brief Literature Review
Existing research highlights the growing divide between AI-driven careers and traditional industries.
Perrault and Clark (2024): In 2023, AI-related job postings accounted for 1.6% of all listings in the United States, down from 2.8% in 2022. This decline is linked to fewer openings from top AI firms and a reduced emphasis on tech roles within these companies.
George (2024): The world has gone through multiple market evolutions, including the Industrial Revolution, which changed the machinery in use and reduced the number of workers needed in the fields. The share of workers employed in agriculture fell from 70% of the developed world in the mid-19th century to 2-3% today. This led to an increase in people working in manufacturing to maintain the technology used in agriculture. The world would likely see a similar outcome once AI becomes present in more industries.
Trends and Insights From The Lightcast Dataset
Data Collection & Cleaning
Download a sample of the Lightcast dataset from Google Drive. Load the CSV file into a pandas DataFrame.
df = pd.read_csv('data/lightcast_job_postings.csv')
Then, we carry out some data cleaning steps to remove unrelated items and handle missing data. Here, we focus on full-time, in-office, non-internship roles.
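The missing-salary handling in the cleaning step follows a fallback order: the reported salary, then the midpoint of the salary range, then whichever bound is available. As a minimal sketch, the toy frame below (made-up values, not Lightcast data) covers each case; `SALARY_FILLED` is a hypothetical helper column introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Toy postings (made-up values) covering each missing-salary case.
toy = pd.DataFrame({
    'SALARY':      [90000.0, np.nan, np.nan, np.nan, np.nan],
    'SALARY_FROM': [80000.0, 60000.0, 50000.0, np.nan, np.nan],
    'SALARY_TO':   [100000.0, 80000.0, np.nan, 70000.0, np.nan],
})

# Fallback order: reported salary, midpoint of the range, lower bound, upper bound.
midpoint = (toy['SALARY_FROM'] + toy['SALARY_TO']) / 2
toy['SALARY_FILLED'] = (
    toy['SALARY']
    .fillna(midpoint)
    .fillna(toy['SALARY_FROM'])
    .fillna(toy['SALARY_TO'])
)
print(toy)
```

Rows with no salary information at all stay missing, so any later salary average should still filter on non-null values.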
# remove duplicated rows
df = df.drop_duplicates(subset='ID')
# convert date columns to datetime format
df['LAST_UPDATED_DATE'] = pd.to_datetime(df['LAST_UPDATED_DATE'])
df['LAST_UPDATED_TIMESTAMP'] = pd.to_datetime(df['LAST_UPDATED_TIMESTAMP'])
df['POSTED'] = pd.to_datetime(df['POSTED'])
df['EXPIRED'] = pd.to_datetime(df['EXPIRED'])
# remove internship
df = df[df['IS_INTERNSHIP'] == False]
# remove non-full time jobs
df = df[df['EMPLOYMENT_TYPE_NAME'] == 'Full-time (> 32 hours)']
# remove remote/hybrid jobs, we want to focus on in-person jobs
df = df[df['REMOTE_TYPE_NAME'] == '[None]']
# fill missing salary data
df['SALARY'] = df['SALARY'].fillna((df['SALARY_FROM'] + df['SALARY_TO']) / 2) # midpoint if both lower and upper bounds are available
df['SALARY'] = df['SALARY'].fillna(df['SALARY_FROM']) # lower bound if no upper bound is given
df['SALARY'] = df['SALARY'].fillna(df['SALARY_TO']) # upper bound if no lower bound is given
Identifying AI and Non-AI Jobs
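We identify AI and non-AI jobs by keyword search: a posting is flagged as an AI job if any keyword appears in its title, skills, specialized skills, or Lightcast sector.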
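Keyword searches of this kind are sensitive to how short terms such as 'AI' are matched. As a minimal sketch on made-up job titles (not Lightcast data), the snippet below contrasts a plain case-insensitive substring match with a word-boundary match. The titles and the two-item keyword list are assumptions chosen only to make the difference visible.

```python
import pandas as pd

# Hypothetical job titles, chosen to expose false positives for the short keyword 'AI'.
titles = pd.Series(['AI Engineer', 'Maintenance Technician', 'Retail Sales Associate',
                    'Machine Learning Scientist', 'Senior AI/ML Researcher'])

keywords = ['AI', 'Machine Learning']

# Plain substring match: 'ai' inside 'Maintenance' and 'Retail' also counts as a hit.
substring_hits = titles.str.contains('|'.join(keywords), case=False, na=False)

# Word-boundary match: the keyword must appear as a standalone token.
pattern = r'\b(?:' + '|'.join(keywords) + r')\b'
boundary_hits = titles.str.contains(pattern, case=False, na=False)

print(pd.DataFrame({'title': titles, 'substring': substring_hits, 'boundary': boundary_hits}))
```

The non-capturing group `(?:...)` keeps pandas from warning about unused match groups. Which matching rule is appropriate depends on how much recall one is willing to trade for precision.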
keywords = ['AI', 'Artificial Intelligence', 'Machine Learning', 'Deep Learning',
            'Data Science', 'Data Analysis', 'Data Analyst', 'Data Analytics',
            'LLM', 'Language Model', 'NLP', 'Natural Language Processing',
            'Computer Vision']
# word boundaries (\b) stop the short keyword 'AI' from matching inside words such as 'maintain' or 'retail';
# the optional 's' still allows plurals such as 'LLMs'
pattern = r'\b(?:' + '|'.join(keywords) + r')s?\b'
match = lambda col: df[col].str.contains(pattern, case=False, na=False)
df['AI_JOB'] = match('TITLE_NAME') \
    | match('SKILLS_NAME') \
    | match('SPECIALIZED_SKILLS_NAME') \
    | match('LIGHTCAST_SECTORS_NAME')
Trend 1: AI vs. Non-AI Jobs Across Industries
We first compare the number of AI and Non-AI jobs across all industries.
df_grouped = df.groupby(['AI_JOB', 'NAICS2_NAME']).size().reset_index(name='Job_Count')
px.bar(df_grouped, x='NAICS2_NAME', y='Job_Count', color='AI_JOB',
       title="AI vs. Non-AI Job Distribution Across Industries",
       labels={'NAICS2_NAME': 'Industry', 'Job_Count': 'Number of Jobs'},
       barmode='group')
As the chart shows, under this keyword-based classification AI jobs outnumber non-AI jobs in almost every industry.
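Raw counts favor large industries, so the AI share of postings within each industry can be a useful complement to the grouped bar chart. The sketch below uses a toy frame with made-up rows; only the column names `NAICS2_NAME` and `AI_JOB` come from the analysis above.

```python
import pandas as pd

# Toy postings (hypothetical values) with the two columns used in the chart.
toy = pd.DataFrame({
    'NAICS2_NAME': ['Retail Trade'] * 4 + ['Information'] * 2,
    'AI_JOB':      [True, False, False, False, True, True],
})

# The mean of a boolean column is the fraction of True values,
# so this gives the AI share of postings within each industry.
ai_share = toy.groupby('NAICS2_NAME')['AI_JOB'].mean().sort_values(ascending=False)
print(ai_share)
```

Applied to the real data, the same `groupby` on `df` would rank industries by how AI-heavy their postings are, independent of industry size.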
Trend 2: AI-driven Job Growth vs. Job Displacement
Now we turn our attention to job growth and job displacement.
df_grouped = df.groupby(['POSTED', 'AI_JOB']).size().reset_index(name='Job_Count')
px.line(df_grouped, x='POSTED', y='Job_Count', color='AI_JOB',
        title="AI vs. Non-AI Jobs Posted Over Time",
        labels={'POSTED': 'Posting Date', 'Job_Count': 'Number of Job Postings'},  # label key must match the x column
        markers=True)
Overall, newly posted AI jobs outnumber newly posted non-AI jobs over the period covered.
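Daily posting counts can be noisy, so a monthly view is one option for reading the trend. The sketch below shows one way to aggregate by month on a toy frame with made-up dates; `POSTED_MONTH` is a hypothetical helper column, not a field of the dataset.

```python
import pandas as pd

# Toy postings (hypothetical dates) spanning two months.
toy = pd.DataFrame({
    'POSTED': pd.to_datetime(['2024-05-01', '2024-05-20', '2024-06-03',
                              '2024-06-15', '2024-06-30']),
    'AI_JOB': [True, False, True, True, False],
})

# Collapse each posting date to the first day of its month,
# then count postings per month and job type.
toy['POSTED_MONTH'] = toy['POSTED'].dt.to_period('M').dt.to_timestamp()
monthly = toy.groupby(['POSTED_MONTH', 'AI_JOB']).size().reset_index(name='Job_Count')
print(monthly)
```

The resulting frame has the same shape as `df_grouped`, so it could be passed to `px.line` with `x='POSTED_MONTH'`.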
df_grouped = df.groupby(['AI_JOB', 'MIN_YEARS_EXPERIENCE']).size().reset_index(name='Job_Count')
fig = px.bar(df_grouped, x='MIN_YEARS_EXPERIENCE', y='Job_Count', color='AI_JOB',
title="AI vs. Non-AI Jobs by Minimum Years of Experience",
labels={'MIN_YEARS_EXPERIENCE': 'Min Years of Experience', 'Job_Count': 'Number of Jobs'},
barmode='group')
fig.show()
df_grouped = df[df['SALARY'].notna()].groupby(['AI_JOB', 'MIN_YEARS_EXPERIENCE'])['SALARY'].mean().reset_index()
# put experience on the x-axis and split the bars by job type
fig = px.bar(df_grouped, x='MIN_YEARS_EXPERIENCE', y='SALARY', color='AI_JOB',
             title="Average Salary: AI vs. Non-AI Jobs",
             labels={'AI_JOB': 'Job Type', 'MIN_YEARS_EXPERIENCE': 'Min Years of Experience', 'SALARY': 'Average Salary'},
             barmode='group')
fig.show()
Looking further at the years of experience (YoE) required by the posted jobs, AI jobs generally require less experience than traditional non-AI ones. In other words, AI-related postings skew toward junior roles, with fewer senior roles than in traditional postings.
Trend 3: Skills in AI Roles vs. Traditional Roles
As we have seen, nearly every industry now has a significant number of AI roles. A natural question is which skills workers should develop to land an AI job.
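import ast
# parse the stringified skill lists; treat missing values as empty lists
df['SKILLS'] = df['SKILLS_NAME'].apply(lambda s: ast.literal_eval(s) if isinstance(s, str) else [])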
ai_skills = df[df['AI_JOB']]['SKILLS'].explode().value_counts().head(20).reset_index()
ai_skills.columns = ['Skill', 'Count']
fig = px.bar(ai_skills, x='Skill', y='Count',
title="Top AI Job Skills",
labels={'Skill': 'Skill Name', 'Count': 'Frequency'},
color='Skill')
fig.show()
traditional_skills = df[~df['AI_JOB']]['SKILLS'].explode().value_counts().head(20).reset_index()
traditional_skills.columns = ['Skill', 'Count']
fig = px.bar(traditional_skills, x='Skill', y='Count',
title="Top Traditional Job Skills",
labels={'Skill': 'Skill Name', 'Count': 'Frequency'},
color='Skill')
fig.show()
As we can see, data science skills (e.g., SQL, data analysis, and programming) are favored in AI-related jobs. In contrast, traditional jobs value candidates with strong communication and management skills.
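The skill counts rest on two steps: parsing each stringified skill list and expanding it to one row per skill. As a minimal sketch, the toy frame below (made-up rows) walks through both steps; `parse_skills` is a hypothetical helper, and the assumption is that `SKILLS_NAME` stores lists as strings with some values missing.

```python
import ast
import pandas as pd

# Toy rows (hypothetical) mimicking a column that stores each skill list as a string.
toy = pd.DataFrame({
    'AI_JOB': [True, True, False],
    'SKILLS_NAME': ['["SQL", "Python"]', '["SQL", "Data Analysis"]', None],
})

def parse_skills(value):
    """Parse a stringified list; treat missing values as an empty list."""
    return ast.literal_eval(value) if isinstance(value, str) else []

toy['SKILLS'] = toy['SKILLS_NAME'].apply(parse_skills)

# explode() yields one row per skill, so value_counts() tallies skill frequency.
ai_skill_counts = toy.loc[toy['AI_JOB'], 'SKILLS'].explode().value_counts()
print(ai_skill_counts)
```

Guarding against non-string values matters because `ast.literal_eval` raises an error when given a missing value.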
Trend 4: Emerging AI Jobs
While many traditional jobs are in the process of adopting AI-related technologies, new AI jobs are also emerging.
Ranking AI job titles by number of postings, we note that Data Analyst leads the list.
job_titles = df[df['AI_JOB']]['TITLE_NAME'].value_counts().head(20).reset_index()
job_titles.columns = ['Job_Title', 'Count']
px.bar(job_titles, x='Job_Title', y='Count',
title="Top Emerging AI Job Titles",
labels={'Job_Title': 'Job Title', 'Count': 'Frequency'},
color='Job_Title')
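The chart above ranks titles by total posting count. Measuring growth would additionally require comparing counts over time. One possible approach, sketched on a toy frame with made-up postings, is the relative change in postings per title between the first and last month of the sample; the growth definition here is an assumption, not the method used in this study.

```python
import pandas as pd

# Toy postings (hypothetical) for two job titles across two months.
toy = pd.DataFrame({
    'TITLE_NAME': ['Data Analyst'] * 5 + ['ML Engineer'] * 4,
    'POSTED': pd.to_datetime(['2024-05-10', '2024-06-01', '2024-06-02',
                              '2024-06-20', '2024-06-25',
                              '2024-05-05', '2024-05-12', '2024-06-07',
                              '2024-06-18']),
})

# Count postings per title and month, with one column per month.
monthly = (toy.groupby(['TITLE_NAME', toy['POSTED'].dt.to_period('M')])
              .size()
              .unstack(fill_value=0))

# Growth = relative change from the first to the last month in the sample.
# Titles with zero postings in the first month would yield inf and need separate handling.
first, last = monthly.columns[0], monthly.columns[-1]
growth = ((monthly[last] - monthly[first]) / monthly[first]).sort_values(ascending=False)
print(growth)
```

With only a few months of data, such growth rates are volatile, so they are best read alongside the absolute counts.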